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Summary 


To our knowledge, the following technical items cover all the attainable 
subjects which can be addressed in this program. The proposed robotic system has 
two major advancements in mechanical design. The modularization consideration 
attempts to incorporate the actuating device directly at each active link. The 
investigation of the modular approach is very important to fully utilize the 
flexibility of reconfiguration capability of general robotic systems. The second 
advancement is to organize multiple redundant degrees of freedom in a systematical 
fashion to extend the dexterity of the robotic system. With fourteen active 
actuators to form the head and two arms, the resulting upper body of the proposed 
robot presents an electromechanical system with unprecedented complexity, in which 
the proposed spatial planning technique (i.e. AISP) will be employed to drive the 
system. 

A novel approach for developing real-time robot vision is one of the main 
results under the framework of SANE (Sensor- Actuator NEtwork) . Dual CCD cameras 
and one frame grabber, together with two rotational degrees of freedom constitute 
the basic apparatus for prototyping the proposed vision system. Multiple 
ultrasonic sensors and proximity sensors would build up a sensor data fusion 
network which can directly and indirectly interface with actuators. The resulting 
robot behavior will be the first prototype to demonstrate the functionality of 
SANE. 


Dynamic Knowledge Evolution (DKE) is the main thrust of integrating the 
artificial intelligence technique to generate the learning capability in robot 
intelligence development. The traditional robotics techniques such as spatial 
planning, kinematic planning, dynamic calculation and control, error correction 
and fault tolerance, will be utilized as the building blocks of passive robot 
knowledge base. Submission control architecture and neural network typed parallel 
distributed processing (PDP) are two possible ways of engagement in system 
integration. Detailed development is part of the on-going process of software 
engineering. 

The final advancement of this program is the construction of a versatile 
MuMicS (Multiple Microcomputer System) . For realizing the GIC-LICs robot 
intelligence system, multiple microcomputers need to be linked together. The 
related data/address bus structure presents a unique solution for performing 
various serial/parallel operations in DMA, cache memory, coprocessor access, 

RAM/ROM memory management, timing and clock, and I/O interface. 

Software development for achieving the project's goals, as were declared in 
the last quarterly report, is the focal point of this period of R&D activities. 

In order to maintain sufficient flexibility for accommodating various 
applications, many related technology fronts were tackled, and different levels of 
progress at various areas have resulted. The efforts required for establishing 
the required development framework are beyond what was originally anticipated. 
However, despite limited resources, it is believed that the best effort has been 
made to catch up on the planned schedule. 

During this period, the upper body of SRAARS (SRA Advanced Robotic System) has 
been completely assembled. The upper body includes two articulated arms and a 
robot head with two CCD cameras as the robot's eyes. There are fourteen degrees 
of freedom in the upper body part, which are actuated by ten DC motors, two 
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stepping motors and two solenoids for controlling the pneumatic power . A 
temporary mechanical stand is currently under construction to support the upper 
body mechanism. The motion control electronic development is behind schedule due 
to the overall consideration of MuMicS, which has been reviewed several times but 
which has not yet come to any final decision. The scarcity of available 
force/torque sensors and tactile sensors has also contributed to the delay of 
finalizing the hardware implementation of SANE framework. However, some of this 
information has just arrived, which will be very helpful in the pursuit of the 
finalization of SANE realization in the near future. As for the robot vision 
development, the major bottleneck of current work is again the software 
development and the integration with other parts of the robotic system. In all, 
there have been some results in the analytical development, software engineering, 
system design and hardware implementation. 


Analytical Development 

The analytical development of the intelligent manipulation technique during 
this period concentrated on two sections: the kinematics planning and the robot 

learning. As a bridge between spatial planning and dynamics control, the 
intelligent kinematics planner (InKiP) should function as a operational buffer to 
smoothing out the transition from the position control (low impedance) to the 
force control (high impedance). Moreover, there are essentially three modes in 
robot manipulation: the inertial mode, the resistive mode and the capacitive 

mode. The InKiP design should have the capacity of flexibly shifting from one 
mode to another without causing any disturbance on the system kinematics. When 
the manipulation is in the Inertial mode, collision avoidance is not concerned. 
There will be sufficient free space for arbitrary kinematics planning. Even with 
potential obstruction of sight, the determination of collision- free traveling path 
will be dealing with the case of open convex set. For the resistive mode of robot 
manipulation, the end-effectors are either very close to or have physical contact 
with the interacted objects. For those cases of physically contacting the 
environment, the external interactive forces are limited to the order of 
frictional effects in the resistive mode. When the robot manipulation is entering 
the capacitive mode, the external interactive force or torque becomes an integral 
part of the active force control, which may play an Important role for 
accomplishing the assigned tasks. The manipulation of human arms, hands and 
fingers Is a typical example of abling to successfully transform from one mode to 
another without abrupt kinematic fluctuations. However, the abrupt fluctuation of 
the associated kinematic profile does happen In the behaviors of infants , elders , 
or the physically challenged. Hence, the successful development of InKiP not only 
would advance the performance of the robotic system, widen the applicable range of 
robotics , but also would Improve the functions of manipulation aids for the elders 
or the physically challenged. 


A. The Intelligent Kinematics Planner (InKiP) 

The analytical content of InKiP construction has two major components: the 

kinematic spline theory and the adaptive logic annealing process. As commonly 
known in robotics research, there are two domains for planning and execution of 
robot manipulation. The three-dimensional global work space describes the actual 
robot maneuvers and the interface with external environment, and the N-dimensional 
joint variable space illustrates the states of the employed control mechanism. 
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The mapping between these two spaces is not a one-to-one mapping in general. 
However, by utilizing the uniquely determined relationship between the first-order 
time derivatives at both domains through the transformation Jacobian, the resolved 
motion rate control was then developed. However, the computational requirement 
for calculating the Jacobian matrix of each point along the motion trajectory is 
substantial, which makes the real-time trajectory planning unfeasible. This 
restriction becomes more detrimental when the number of degrees of freedom of the 
robotic system and the complexity of the system's configuration increase. A 
certain interpolation operation would have to be employed such that part of the 
trajectory planning workload can be carried over to the joint space. Since the 
joint space is the direct control domain, an on-line planning function can be 
readily implemented and its effect on the control stability can be easily 
assessed. As long as the resulting performance in the global work space is within 
the acceptable tolerance range, the planning of the traveling path is preferable 
in the joint space than that of the work space. If the required robot performance 
in the work space is specified by a sequence of three-dimensional locations 
without timing-dependency, then employing various existing trajectory planning 
techniques in the joint space would be sufficient. But, if the required motion 
trajectory of the robotic system in the work space is timing dependent or even 
timing critical, then there are two choices for performing the trajectory 
planning. One can plan the path with kinematic consideration in the global work 
space first, then transform the results into the corresponded joint space, which 
may encounter some nonsmooth or unfeasible joint variable profile; or one can plan 
the kinematic profile of motion trajectory in the work space with the 
consideration of the kinematic smoothness in the joint space. The kinematic 
spline theory is developed to realize the later proposition. 


('!’). The Kinematics Spline Theory (KST) 

The principle of the kinematics spline theory (KST) is to adequately patch the 
piecewise trajectory planning elements such that the continuity and the 
differentiability can be maintained in the spatio-temporal domain. Moreover, for 
the consideration of robotics applications, the continuity and the 
differentiability of the motion trajectory will be examined in both the global 
work space and the joint space. The minimization procedure of AISP does not have 
a specific interpolation strategy, which makes the planned IKP solution 
unpredictable. The developed KST will be used to fill the gap to carry out the 
dual-space trajectory planning task. In general, a segment of trajectory in which 
one of the end effector is traveling is a segment of space curve with temporal 
correspondence. The simplest parametric representation is 

g = g(u) , u 0 £ u ^ u lf 

In the three-dimensional global work space, 

g = g(x, y, z) « g( s ) 


where 

ds = [ dx 2 + dy 2 + dz 2 ] 1 

If the initial position is S 0 [x 0 ,y 0 ,z 0 ] and the final position is S 1 [x 1 ,y 1 ,z 1 ] , 
then the line integral 
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In the N- dimensional joint space, 

g — g(ji / 32 r ••••/ 

and for the temporal consideration, 

g = g (t) , tj Z t £ t f 

The uniqueness of this segment of trajectory makes the following mappings uniquely 
determined: 

U <-> S <-> (X,y, Z) <-> (j 1# j 2 r •••» jN) <-> t 

From the system dynamics point of view, unless the robot is in a complete halt 
mode, the spatio-temporal correspondence of every two adjacent points is dependent 
upon the kinematics of the predecessor point. Some of the common factors in 
determining this dependency include the moment of inertia of both the robotic 
links and the actuation devices, the capacity of those actuators, frictions, the 
gravitational force, the Coliolis and the centripetal forces. Since it would be 
impossible to constantly evaluate and measure all these terms for planning a 
feasible trajectory, the KST is applied to determine the transition from one 
control point to the next control point, where the associated two point boundary 
valued problem with constraints on both the velocity and the acceleration levels 
is solved at joint space with the spline technique, and by determining the 
terminal kinematic profile of joint variables from the inverse kinematics mapping 
of the work space trajectory, which itself is generated by employing the spline 
technique in the work space. 


For each joint variable, an interpolation of order 3 is applied to the 
acceleration profile with the estimated maximal and minimal accelerations one the 
knots. Due to the unique characteristics of the convex hull, the interpolated 
acceleration will be contained within the specified range. If the Berstein-Bezier 
approach Is employed, then the acceleration function would by 

3 

A(u) = l A k <*» kf 3(u) 

k=0 
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where 



3 bj^ fci+2 

D(t) = Dj + V Q t (i + 1) (i + 2 ) 

where 

b 0 = A 0 

t>l = 3 (A! — Aq) 

b 2 = 3 (A 2 - 2A X + A 0 ) 

b 3 = A3 — 3 A 2 + 3A3 — Aq 

A 0 is the initial acceleration at tj 
A 3 is the final acceleration at ty 

A 1 and A 2 are the control knots which represent the 
effect of actuator limitations on force/torque 
generation. 

In order to achieve the final goal of reaching D y at ty, the following 
equation results. 
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where 


AD 

At 


V 0 


3 

= I Cj Aj 

3=0 


AD = — Dj 



C 2 “ - 2 ^ <iit)3 + 7 (it)2 

C 1 = ^ (it )3 - i (At) ^ + i (At) 

00 = -^ (At) ^ + i (At) 2 -| (At) + | 


A Q, Vq are known initial conditions of joint velocity and acceleration. If the 
final given acceleration is A 3 , and assuming the acceleration constraint has the 
relationship A 2 = — Aj., then A^ can be uniquely determined as 


Ai = 


(AD/At) - Vp - CqAq - C3A3 
C 1 “ C 2 


Also, on the other hand, if the acceleration constraints (i.e., A-^ and A 2 ) are 
given, then the final acceleration (A 3 ) can be determined accordingly. For the 
special case of a complete stop at both end points, that is, Vq = Aq = A 3 = 0, the 
acceleration constraint can be computed as 

AD 1 

Al " AT * C-l - C 2 

As for the trajectory planning in the three-dimensional work space, not only 
the cubic spline can be directly utilized to interpolate the path along each 
individual axis, but also the orientation of the trajectory can be determined by 
the Serret-Frenet formulas. As summarized in the following matrix format, the 
transition of the tangent, the normal, and the binormal vectors can be calculated 
by the kinematic profile of the trajectory. 


where 

A = [n, b, t] , the path orientation matrix 
h = the normal vector 
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the binormal vector 

the tangent vector 

0 — r k" 

r 0 0 

-k 0 0 


k = the curvature 


r *= the torsion 


k and r can be determined as 


k2 


(x^ 1 ) X 

%( 2 )] 

• 

'x(U 

1 x x( 2 )] 


[x* 1 ) 

• xcn] 

3 



[x(U x(2) 

X<3>] 


(xd) x X(2f 

9 | 

x(i) 

X X<2>] 


X = [x(t) , y (t) , z(t) ] T 


x(i) 


dt 1 


( • ) = the inner product 
( x ) = the cross product 


However, the corresponding trajectory in the work space with the interpolation 
in the joint space will not be identical to the interpolated trajectory in the 
work space. The interpolation in the work space is used as the guideline in 
searching for the "reasonable" trajectory for joint space interpolation, which 
becomes more important when the adaptive logic annealing process is employed. 

Considering the work space is divided into several subspaces by the 
hypersurfaces which are mapped from the geometrically singular surfaces in the 
joint space, from the spatial relationship to the kinematics, the planning of the 
motion trajectory would have to move from one subspace to another. Guided by the 
piecewise polynomial approximation and constrained by the limitations on the 
kinematics profile, the resulted trajectory would have the minimum transactions 
between subspaces which can reduce the possibility of redundant oscillations. 

Besides the dual-space trajectory planning, the developed KST can be used to 
derive the on-line trajectory modification scheme which is crucial for interactive 
mode switching. The resulted on-line trajectory modifier will generate stepwise 
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interpolation to accommodate the latest control parameters and determine the next 
motion command. 


m. The Adaptive Logic Annealing Process (ALAP) 

Planning the kinematics profile of the robot manipulation trajectory is a task 
with multiple possible approaches. Previous approaches of determining the profile 
with a specific optimization procedure or a heuristic search do not provide the 
sufficient spectrum for the robot to handle heterogenerous tasks. It requires an 
adaptive scheme to transform from one planning algorithm to another with smooth 
transition and consistent principles. The developing AlAP technique will meet 
this requirement. The considered methodology Is the Self -Organized Neural 
Networks (SONN) , which has three major components: the generating rule, the 

evaluation method and the search strategy. The entire logic of InKiP is modeled 
as a neural network structure. The generating rule would select the new function 
for a new node from a given set of prototype functions ; the evaluation method will 
determine how well the current logic is performing in terms of trajectory 
planning; and the search strategy will provide other available alternatives if 
there is room to improve the system performance. 

By utilizing the concept of maintaining the simplest functionality at each 
individual node, the AlAP will be suitable for developing decision machines with 
parallel distributed processing. For this particular application, the main 
integrity would be the development of geometric reasoning for the robotic systems 
with the consideration of system dynamics. The details of AlAP will be presented 
in the next quarterly report. 


B. The R obot Learning 

In order to manage the entire range of eight major operations defined as the 
robot brain functions, the robot learning mechanism has to be flexible, reasonable 
and sufficiently compact. More than often, the user's commands may not be 
precise, the encountered environment may not be identical to what has previously 
being measured, and some unexpected interruption may alter the defined operational 
sequence. An intelligent robotic system should be able to overcome these types of 
fuzziness up to a certain extent where the original assignment can still be 
partially recovered. A new technique is currently under development called 
FULOSONN, an acronym for FUzzy LOgic Self -Organized Neural Networks. The fuzzy 
concept includes formal analysis of the fuzziness in commands, control, searching 
and reasoning. The logic approach stands for the embedded expert system 
implementation with the rule -generation, the inferencing and the knowledge 
representation. The self-organized capability is formulated by the on-line 
modification of either the network topology or the connection functions, or both. 
The incorporation of neural networks would emphasize the flexible internal 
representation of the learning mechanism with or without supervision. 

The fuzzy commands concept is a very friendly user interface. For advancing 
the coordination between the human operator and the robot, the development of 
recognizing the fuzzy commands is critical. A specific fuzzifier will be 
constructed to read these types of commands and then translate them into ranges of 
quantitative descriptions with proper probability consideration. By maximizing 
the associated likelihood function, the desired actions can be selected. The 
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fuzzy control would provide the approximate control strategy to convert the 
processed fuzzy input into the defuzzified enforcing function. A combination of 
fuzzy control law and fuzzy logic process would normally be considered as the 
fuzzy controller design. Fuzzy searching is employed when a one-to-one mapping 
between the input domain and the output domain does not exist. Instead of the 
conventional heuristic search, the fuzzy searching considers the complete spectrum 
with fuzzily smoothing input information, which enlarges the probability of 
finding the feasible solution with high efficiency. Since this searching 
technique can translate fuzzy inputs into adequate executable outputs, it is also 
referred to as the "fuzzy inferencing" . As for fuzzy reasoning, it is an 
augmented version of the hypothesis-and-test paradigm. Not only the postulated 
rules have certain probabilities of being true, but also the derived outcomes 
contain conditional tolerable grey areas. One major advantage of fuzzy reasoning 
is to provide a remedy to relax the rigidity of the traditional expert system 
logic. For many application cases, the capability of "justifiably bending the 
rules" may be highly desirable. 

On the other hand, "common sense" is always the foundation of any behavior 
activation. The provision of logic stands for the operational rules for 
determining "what, when, where, who and how" of each action. As the organizer of 
executing multiple tasks, running multiple procedures and coordinating the usage 
of available resources, it has to be deterministic, predictable and yet adjustable 
to manage variable conditions. Disregarding it is defined by a single, thousands, 
or millions of rules, self -conflicting points and redundant loops should be 
eliminated. An efficient expert system would provide specific execution 
recommendation by processing the input information with induction/deduction and 
forward/backward chaining. Some of the most important logic processes include the 
linguistic and semantic logic processing, symbolic manipulations, the numerical 
computations and the stochastic processing. 

The integrated framework of FULOSONN provides most of the required techniques 
to explicitly establish the robot's learning capability. Merging these four 
fronts, the on-board robot intelligence can incorporate the basic knowledge of 
robot manipulations via man-machine interface inputs and can establish the expert 
system for standard maneuvering, manage the fuzzy commands with fuzzy controller, 
activate the fuzzy searching for unknown situations, handle the conflicting cases 
with fuzzy reasoning, organize the knowledge representation in a parallel 
distributed fashion such that some constantly failing memory will be rectified in 
a superimposed structure of layers of neural networks, which has the 
self-organized capability to modify the network topology as well as the connection 
functions . A complete development of the FULOSONN framework is beyond the scope 
of this program. A simplified version of FULOSONN implementation will be 
developed to support the learning mechanism of the SRAARS prototype. 


Overall System Design Update 

After months of investigation and evaluations, the first MuMicS prototype is 
going to utilize the well-developed LAN (Local Area Networks) technology to 
connect the GIC and those LICs . The basic port communication (parallel or serial) 
is too limited to be useful. Other low end equipment interface specifications, 
such as RS232 or IEEE488, do not have sufficient control capability over the 
communication line, which would make the vital connections among GIC and LICs very 
vulnerable. Various bus structures, such as PC bus, PC-AT bus, STD bus, VME bus, 
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MULTIBUS I or II, or FUTUREBUS, tend to be an internal connection arrangements, 
which are severely limited by the available hardware connectors and the 
Insufficient system software support. In order to consider the expandability, the 
security of GIC-LICs communication and the integration of scalable local 
Intelligence distribution, it was finally decided to employ the well-established 
PC IAN technology as the first implementation platform for demonstrating the 
MuMicS architecture. The considered PC LAN product is the Novell Netware. The 
selected connection would be a 16 -bit Ethernet. Each subsystem will be equipped 
with a specific motion controller board, the driver board for operating various 
sensors and/or detection devices, some basic memory management circuitry and 
a network communication board. The procurement of the required hardware and 
software is expected to be accomplished within the next period. 


Software Development 

Conversion from C language programs to the object oriented C++ programs and 
the development of the embedded expert system, fuzzy logic and neural networks are 
two major targets in current software development. Both of them have extensive 
involvement in state-of-the-art technology digestion, searching for available 
tools and application design. Along these two lines of work, there is no specific 
result which can be presented at this moment. As an on-going process, the goal is 
that there would be some solid software modules being constructed In the next 
period. 

The development of the robot vision system has been consolidated into one 
conference paper which had been presented at the SPIE's International Symposium on 
Advances In Intelligent Systems, November 5, 1990 In Boston MA. A copy of the 
paper is included In the Appendix. 


Hardware Development 

The completion of the upper body of the SRAARS is the major hardware 
accomplishment during this period. Figure 1 shows four different views of the 
upper body construction, Figure 2 shows a close-up of the robot head, and Figure 3 
illustrates the evolutionary path of the SRAARS development. 
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Fig. 2. The Head of the SRAARS. 
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Advanced Robotics Group of SRA, Inc. @ Sept. 1990 


Fig. 3. The Evolutionary Path of SRAARS development. 
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Intelligent' Vision 1 Process for 'Robot Manipulation 

Alexander Y.K. Chen ! and Eugene Y.S. Chen 

Scientific Research Associates, Inc. 

50 Nye Road, P.0. Box 1058 Glastonbury, CT 06033 


ABSTRACT 

A new approach Is introduced in this paper to deal with the problems of 
real-time machine vision and pattern recognition for robotic manipulations. This 
approach emphasizes three directions : (1) the developed algorithm has to be compact 

enough for embedded intelligent control implementation; (2) the computational scheme 
should be highly efficient for on-line robot reasoning and manipulations; and (3) 
the resulting system has to be sufficiently flexible to accommodate various working 
environments and to cope with some system shortcomings. The vertical integration of 
related vision hardware, image analysis software and analytical techniques (e.g. 
fuzzy logic and neural networks) together with the novel algorithms for robot 
eye -brain-hand coordination constitutes a unique robot vision system. The potential 
of more extensive hardware implementation is discussed and a wider spectrum of 
applications of the proposed robot vision system is envisioned. 


1. INTRODUCTION 

Both machine vision and robotics are multi-disciplinary, computer-related and 
application- oriented. To merge these two fields, the following challenges were 
experienced: (i) the computational requirements for vision information gathering 

and image analysis is substantial; (ii) a certain format of implicit representation 
of robotic manipulations In vision database is required; (iii) the geometric 
reasoning for robot eye -brain-hand coordination Is essential to vision-based robot 
control . 

Employing CCD or CID techniques, the resulting vision signals of one camera 
would represent a two-dimensional map of a three-dimensional projection. In order 
to extract information from the vision signals, a certain procedure of sorting, 
filtering, or transformation is required. From low level processing^* to high level 
analysis*, there are quite a few algorithms being developed in the field of machine 
vision. However, most of them would need either sophisticated computational schemes 
or a substantial amount of memory space, or both- -which prevents a direct 
implementation of those techniques in robotic system applications. Furthermore, 
some of the foundations of utilizing machine vision in robot manipulations are not 
yet available, e.g. vision-based motion analysis, virtual construction of 
three-dimensional objects, and linkage representation with incomplete vision 
information, etc. Also, It is essential to establish a reasoning mechanism which 
can digest vision Information as well as other sensor inputs for the purpose of 
robot manipulations. In this paper, a novel approach is presented to Introduce the 
integration of vision, intelligence and manipulation of a robotic system. 

In an attempt to make the entire robot vision package as compact, efficient 
and flexible as possible, the system Is divided into three phases. In Phase I, the 
images are acquired by two CCD cameras synchronously and stored in the computer 
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i memory for processing. In Phase II, several essential Image analysis procedures J 
take place, e.g. edge detection, feature extraction and object status 
identification. In Phase III, the results of pattern recognition are fed into the 
virtual representation of the working environment to determine the updated 
relationship between the robot manipulator and the obstacles, where the virtual 
representation Includes the solid models of the robot body and the related nearby 
objects. The procedure of verification and validation between the current status 
and the new status will be activated. The associated motion analysis would provide 
critical Information In supporting the robotic manipulations. For implementing this 
new approach to the parallel distributed processing (PDP) platform, the Phase 
I and the Phase II will be localized In one Local Intelligence Center (LIC) , 
and the Phase III will be performed at the Global Intelligence Center (GIC) . 

The details of the Intelligence allocation can be found in Ref. 3. 

This paper is organized in the following sequence: a new approach for 

preprocessing low level vision signals is introduced in the next section which 
presents an effective and simple method to store useful vision raw materials. The 
employed stereoscopic viewing system is explained in the third section, which is 
used mainly for determining the object location and its size. The associated 
pattern recognition procedure Is presented in the fourth section. The interactive 
coordination between the manipulator controller and the vision system is explained 
in the fifth section. Some of the preliminary results are shown in the sixth 
section. Conclusions and future researches are included In the last section. 


2. AN EFFECTIVE IMAGE REPRESENTATION FOR ROBOT VISION 

Low level vision processing schemes have been studied and discussed 
extensively in the past decade^" . Almost all the methods require substantial 
computations and the results still remain case- specific. These conditions may be 
appropriate for a dedicated Image analysis task, but will not be suitable for robot 
vision applications. Since the focusing point of robot vision is somewhat different 
than that of the ordinary Image analysis, a new approach of image representation Is 
developed which is designed for real-time mobile vision applications. 

Before the procedure of edge detection, a step of preconditioning the input 
signals is essential to any vision system. In our laboratory, the employed image 
frame grabber is capable of performing the first stage of offsetting and restoring. 

A simple histogram algorithm is then employed to smoothe out the gray levels and 
enhance the image contrast. An adaptive single threshold is considered to convert 
the pixel map of 256 gray levels into several pixel maps of binary values. By 
comparing the modified gray level of each pixel with those of its right and Its 
lower adjacent neighbors, a new pixel map is generated by assigning the 
corresponding value of each pixel to be M l" if the difference is greater than the 
given threshold, or the value M 0 H if otherwise. A global search of objects can be 
carried out with interactive threshold adjustment and binary maps comparison. 

The ideal edge representation in a pixel, map is a precise curve description of 
each visible edge projection. However, the actual converted binary pixel map would 
normally have a certain width of pixel band to represent each edge . The thinning 
process* is one possible way for improving this situation. Another difficult task 
is the reconstruction of some edges from the broken pieces. Noise filtering and 
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j edge localization are also very important. Some of those techniques can be found in 
; Refs. 9-11. The resulting two-dimensional array of pixel locations of value "l" 
represents the image of interested objects, which can be explored in searching for 
unknown objects or can be examined to determine the status of some specified 
targets. This type of two-dimensional arrays can be considered as signatures of 
moving objects as well. Figure 1 shows an example of a graphic display of an image 
representation, which contains one two-dimensional object and two three-dimensional 
objects. The detailed procedure of feature extraction and object recognition will 
be explained in later sections. 


3. OBJECT STATUS IDENTIFICATION WITH STEREOSCOPIC VIEWING SYSTEM 

Edge level image representation is regarded as the final result of the 
low-level vision process. In order to perform feature extraction and object 
recognition, many previous approaches can be found In the survey of Besl and Jain^. 
For robot vision development, it Is proposed that the complexity of high-level 
vision processing should depend upon the Intelligence level of the robot. Since the 
utmost goal of robotic system development Is to establish its autonomy such that It 
can execute simple tasks with effectively simple operations and perform delicate 
jobs with more sophisticated skills, it is therefore crucial to maintain the 
high-level vision processing flexible enough to respond to different levels of robot 
tasks. The proposed object status Identification method will examine the tasks of 
determining simple two-dimensional objects first. The three-dimensional object 
recognition and the associated occlusion problem will then be discussed. 

Assuming that the geometric descriptions of the designated targets are already 
available, the purpose of object status identification Is to determine the existence 
of the specified targets, the locations of the targets once they have been 
Identified, and the relationship between the targets and the vision system. The 
relevant parameters for describing the locations of the objects are their size, 
their positions and the orientations of the objects as shown in the Image. The 
capability of zooming Is also very important to minimize the computational 
requirement of status Identification. One possible approach for determining the 
optimal window size and location is to utilize the cluster analysis. 

After the zooming procedure, there are four parameters should be determined 
for calculating the object location: the positions of the object's reference points 

In the left and the right views, the distance between the pivoting points of the two 
cameras, and the vergence angle of the cameras. The center of the line segment 
specified by the pivoting points of two cameras Is selected to be the origin of the 
coordinate system of the robot vision. A preliminary study has been done to 
determine the correspondence between the pixel maps and the three-dimensional object 
location with different vertical and horizontal view angles of cameras. The 
calculated object location Is a function of the positions of the referenced points, 
the actual object size, the vergence angle and the view angles of cameras. The 
configuration of the developed robot vision system is shown in Figure 2 and the 
physical realization is displayed in Figure 3. There are two actuators dedicated to 
the viewing mechanism of the robot vision by providing the asynchronous and 
synchronous rotations of both cameras . Other necessary degrees of freedom would be 
supported by the motion of the robot's shoulder. The associated stereoscopic motion 
analysis has been considered in Ref. 12. However, utilizing the moving capability 
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j of both the vision system Itself as well as the connected robotic system has not yet 
. been explored. The proposed technique would tackle the motion stereo as part of the 
sensor- actuator network (l.e. the SANE framework^). Some of the basic relationships 
among variables and parameters are listed as follows: 

The object position in the vision coordinates is P(x,y,z), which is determined 
in the following manner: 

y = L/(tan a + tan £) 

L L 

x = + y tan /9 = — — y tan a 

2 2 

z = yz r /(w cos a) = yz^/(w cos /9) 


where 


tan a = 


tan V a - (x r /w) 

1 + (x r /w) tan V a 


tan P = 


tan V a + (Xjj/w) 

1 - (x x /w) tan V a 


tan a + tan /? = 2 ^1 + — tan v a + ~~ — t 1 — tan 2 V a ] 


cos a = 


w cos V a + x r sin V a 


o 2 

Jw 2 + x r 


cos /9 = 


w cos v a — Xjjsin v a 


W 2 + X , I 


X, 


= d hp • 


xr 


— d hp * p x£ 


z r — d vp * ^zr 


z l ~ d vp * p z A 
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Notations : V a 

L 


d hp 


a vp 

(P xr / P zr ) 


w 


. ♦ * r* ; ; . :*.*r j to . i-is' ! * 

Vergence angle 


Distance between 


the pivoting points of two cameras 

Horizontal distance between two adjacent pixels in 
the referenced window (in here, d hp «= 2.63 mm) 

Vertical distance between two adjacent pixels in the 
referenced window (in here, dyp = 2.09 mm) 

The corresponded pixel location of the object in the 
right referenced window 

The corresponded pixel location of the object in the 
left referenced window 


= The distance between the pivoting point of the 
camera and the corresponded referenced window (in 
here, w is set to be 1 m) . 


The signs of V a , ct and /? are all set to be positive if the angles are inward, 
and negative if they are outward. 

Once the object has been located in the stereoscopic vision, it can be 
extracted from the background and rescaled into the size of a standardized template 
for further analysis. The polar coordinates transformation of the object map is 
known to have the advantage of determining the shape of the object if it is a 
closed-form representation. By scanning the extreme points of the transformed 
curve, both the shape and the orientation of the object can be obtained. Due to the 
effect of discretization, the selected template size would affect the precision of 
object recognition. The resulted output map, namely the polar table, contains the 
information of the distances from the edge points to the origin of the polar 
coordinates. Hence, the polar table virtually converts the rotational relationship 
in the original coordinates into a translational relationship in the polar system. 
Therefore, any rotation of a closed- form object representation can be analyzed with 
this technique. 

Xn order to demonstrate the concept of the polar table approach, a simple 
two-dimensional case is employed with a template size of 21x21. Three types of 
basic geometric shapes are used: Triangle, Circle and Square. These shapes have 

their unique comer numbers: 3, 0 and 4. Once the polar table is formed, the 

comer number can be readily obtained. An algorithm of counting the local maximal 
points and their locations is derived, and the results would directly indicate the 
shape as well as the orientation of the given object. Figure 4 shows the standard 
templates and their corresponding polar tables. This simple technique can be used 
to perform the initial calibration of the eye-hand coordination. Also, it can be 
extended to consider any object with a unique two-dimensional shape or projection. 

Another similar approach for dealing with this problem is complex- log 
conformal mapping 13 . Although complex-log mapping is capable of converting both the 
rotation and the scaling into a translational representation, the stretching problem 
caused by the logarithm operations presents a more sensitive situation in 
determining the distance between the observed objects and the vision system. The 
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polar table approach shown above appears to be a more efficient method in | 

determining the orientation and the shape of the observed objects. 


4. THREE-DIMENSIONAL OBJECT RECOGNITION 

The focus of three-dimensional object recognition in robot vision is to reason 
the geometric relationship among concerned objects, such as the designated targets, 
the affected obstacles and the robot body. In order to verify the geometric 
relationship between two edges, two surfaces, or two solid volumes, the comparisons 
of distances, positions, orientations, shapes and sizes constitutes a crucial scheme 
in determining the consistency among geometric representations , where the reasoning 
rules are merely general guidelines for engaging sequential comparisons . The 
resulting object recognition technique is a progressive procedure in which simple 
objects can be readily identified and complex relationships among intertwined 
objects will be determined by examining the correlation among images from different 
view angles. Moreover, the Input from other types of sensors and the interactions 
with the manipulation of the robotic system can also be utilized to carry out those 
delicate three-dimensional object recognition tasks. 


5. MACHINE VISION AND ROBOTIC MANIPULATION 

Normally, machine vision and robot manipulation are treated as two independent 
subjects. Only a few papers (for example. Ref. 14) incorporate the vision 
information into the robot's motion control loop. The objects which the vision 
system is applied to are basically two-dimensional, well-defined configurations. On 
the other hand, there are some developments' 15-16 w hich utilize the vision 
information to guide the navigation of the mobile robot. For most of those existing 
systems additional sensor Input Is required, the working environment is assumed to 
be uniform, and the robot manipulation is considered separately if there Is any. 

What we propose here will be the first attempt to Incorporate the vision process 
Into the robot Intelligence for geometric reasoning. Since the real-time reasoning 
mechanism would directly relate to the robot manipulation control , only a minimum 
set of vision information is constantly updated to ensure the performance of robot 
operations. Also, within the reasoning process, the symbolic representations of 
objects are employed to keep the processing kernel size as compact as possible. 
Hence, after the object status has been Identified, the information will be 
converted into the status of symbolic objects which will then be used in the 
procedure of motion decisions. The entire functional flow chart Is shown In Fig. 5. 


6. SOME EXPERIMENTAL RESULTS 

For demonstrating the complete process of the developed robot vision system, a 
series of pictures are taken to represent the result after each procedure. Figure 
6a shows the original Image. Figure 6b displays the discretized picture after image 
capturing. Figure 6c illustrates the result after the histogram processing. Figure 
6d depicts the result after threshold scanning. Figure 6e shows the result after 
the thinning process. Figure 6f presents the final edge representation as the 
output of the low-level vision process. Figure 7a demonstrates the process of 
cluster analysis. Figures 7b and 7c show the two views of the corresponded 
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stereoscopic vision. Figure 7d illustrates the object within the pre-determined 
template as the result of the zooming process. Figure 7e shows the identifiable 
template, and Fig. 7f is the resulted polar table. The object is identified as the 
given object "A" with the size 24.2 cm , the distance 1.055 m, the position 
(-8.6,105.5,-9.3) cm and the orientation (175.1° ,177. 7° ,117.5°) [in Eulerian 
Angles]. By continuously taking several snapshots, the mobility statu s of the 
object "A" can be determined. If the observed object is unknown, then the mobility 
of the vision system can be commanded to vary either the vergence angle or the angle 
of the central line or both. By comparing the results of several images from 
different angles, the object recognition process will be activated to categorize the 
observed object. With two SONY XC-57 CCD cameras, Metrabyte MV1 frame grabber, 

INTEL 25MHZ 80386DX microprocessor and 80387-25 math coprocessor, the overall vision 
processing time is less than 8 seconds. It can be significantly Improved with more 
advanced hardware devices- -which are already commercially available. Hence, this 
new vision system presents a very attractive module for advanced robotic system 
development. 


7. CONCLUSIONS AND FUTURE RESEARCHES 

A new approach for developing the real-time robot vision system is presented 
in this paper. One of the major advantages of the new approach is to establish a 
flexible vision processing sequence such that the robot intelligence can 
discriminate simple tasks from those complicated jobs and would be able to activate 
only the necessary procedures to accomplish the assignment. New techniques in edge 
detection, thinning process, stereoscopic viewing, object status identification and 
potential unknown object recognition are developed. The vertical integration of all 
the new developments and the available related techniques has resulted In a powerful 
robot vision system, which will be Incorporated into the development of SRAARS . 

With a total of eighteen active joints, constituting two articulated mechanical 
arms, one robot head with vision, a lower body with cylindrical workspace and a 
potentially mobile base, the utilization of the vision capability would be an 
important part of the multi-tasking, real-time learning and parallel scalable robot 
intelligence. 

It is believed that the most appropriate principle for robot vision 
development should be based upon the same concept found in those biological systems. 
In the past two decades, the results of scientific research concerning biological 
neural systems has indicated some fascinating properties in providing the generic 
capability of distributed memory and learning. The techniques of generating the 
artificial neural networks (ANN) have been vigorously investigated throughout the 
years. Some of those ANN models have shown a certain extent of image pattern 
recognition capability 17 ' 19 . Methods such as the back-propagation algorithm 20 or 
the Boltzmann machine * 1 are capable of identifying known objects under supervision. 
Others such as the Adaptive Resonance Theory (ART ) 22 or the self- associative memory 
model 2 ^ address the unsupervised learning mechanism. With currently available 
solid-state technology, the hardware implementation of ANN circuitry has become an 
important step towards the actual realization of ANN theories. The new robot vision 
system presented herein can incorporate the ANN techniques and the fuzzy logic 
process to further strengthen its capability In fault tolerance. It will be pursued 
in the near future . 
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Figure 5. The Functional Diagram of the New Robot Vision System. 
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